Merging data frames
Problem
You want to merge two data frames on a given column from each (like a join in SQL).
Solution
# Make a data frame mapping story numbers to titles stories <- read.table(header=T, con <- textConnection(' storyid title 1 lions 2 tigers 3 bears ')) close(con) # Make another data frame with the data and story numbers (no titles) data <- read.table(header=T, con <- textConnection(' subject storyid rating 1 1 6.7 1 2 4.5 1 3 3.7 2 2 3.3 2 3 4.1 2 1 5.2 ')) close(con) # Merge the two data frames merge(stories, data, "storyid") # storyid subject rating title # 1 1 6.7 lions # 1 2 5.2 lions # 2 1 4.5 tigers # 2 2 3.3 tigers # 3 1 3.7 bears # 3 2 4.1 bears
If the two data frames have different names for the columns you want to match on, the names can be specified:
# In this case, the column is named 'id' instead of storyid stories2 <- read.table(header=T, con <- textConnection(' id title 1 lions 2 tigers 3 bears ')) close(con) # Merge on stories2$id and data$storyid. merge(x=stories2, y=data, by.x="id", by.y="storyid") # id title subject rating # 1 lions 1 6.7 # 1 lions 2 5.2 # 2 tigers 1 4.5 # 2 tigers 2 3.3 # 3 bears 1 3.7 # 3 bears 2 4.1 # Note that the column name is inherited from the first data frame (x=stories2).
It is possible to merge on multiple columns:
# Make up more data animals <- read.table(header=T, con <- textConnection(' size type name small cat lynx big cat tiger small dog chihuahua big dog "great dane" ')) close(con) observations <- read.table(header=T, con <- textConnection(' number size type 1 big cat 2 small dog 3 small dog 4 big dog ')) close(con) merge(observations, animals, c("size","type")) # size type number name # big cat 1 tiger # big dog 4 great dane # small dog 2 chihuahua # small dog 3 chihuahua
Notes
After merging, it may be useful to change the order of the columns. See ../Reordering the columns in a data frame.